Automatically exposing OpenLifeData via SADI semantic Web Services
Abstract
BACKGROUND Two distinct trends are emerging in how data is shared, collected, and analyzed within the bioinformatics community. First, Linked Data, exposed as SPARQL endpoints, promises to make data easier to collect and integrate by moving towards the harmonization of data syntax, descriptive vocabularies, and identifiers, as well as providing a standardized mechanism for data access. Second, Web Services, often linked together into workflows, normalize data access and create transparent, reproducible scientific methodologies that can, in principle, be re-used and customized to suit new scientific questions. Constructing queries that traverse semantically rich Linked Data requires substantial expertise, yet traditional RESTful or SOAP Web Services cannot adequately describe the content of a SPARQL endpoint. We propose that content-driven Semantic Web Services can enable facile discovery of Linked Data, independent of their location.
RESULTS We use a well-curated Linked Dataset, OpenLifeData, and its descriptive metadata to automatically configure more than 22,000 Semantic Web Services that expose all of its content via the SADI set of design principles. The OpenLifeData SADI services are discoverable via queries to the SHARE registry and are easy to integrate into new or existing bioinformatics workflows and analytical pipelines. We demonstrate the utility of this system by comparing Web Service-mediated data access with traditional SPARQL, and note that this approach not only simplifies data retrieval but also provides protection against resource-intensive queries.
CONCLUSIONS We show, through a variety of different clients and examples of varying complexity, that data from the myriad OpenLifeData can be recovered without any need for prior knowledge of the content or structure of the SPARQL endpoints. We also demonstrate that, via clients such as SHARE, the complexity of federated SPARQL queries is dramatically reduced.
Related articles
Merging OpenLifeData with SADI services using Galaxy and Docker
Semantic Web technologies have been widely applied in Life Sciences, for example by data providers like OpenLifeData and Web Services frameworks like SADI. The recent OpenLifeData2SADI project offers access to the OpenLifeData data store through SADI services. This paper shows how to merge data from OpenLifeData with other extant SADI services in the Galaxy bioinformatics analysis platform, mak...
Enhanced reproducibility of SADI web service workflows with Galaxy and Docker
BACKGROUND Semantic Web technologies have been widely applied in the life sciences, for example by data providers such as OpenLifeData and through web services frameworks such as SADI. The recently reported OpenLifeData2SADI project offers access to the vast OpenLifeData data store through SADI services. FINDINGS This article describes how to merge data retrieved from OpenLifeData2SADI with o...
The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation
BACKGROUND The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technolo...
Automated Generation of SADI Web Services for Clinical Intelligence using Ruled-Based Semantic Mappings
We present a framework that automates the generation of SADI semantic web services from declarative service descriptions and semantic mappings to relational data. Mappings are specified in a Datalog sublanguage of Positional-Slotted Object-Applicative (PSOA) RuleML. We outline a novel methodology, a system architecture, and a prototype implementation for service generation. A proof-of-concept i...
SADI for GMOD: Semantic Web Services for Model Organism Databases
Here we describe work-in-progress on the SADI for GMOD project (SADI: Semantic Automated Discovery and Integration; GMOD: Generic Model Organism Database), a distribution of ready-made Web services that will bring additional model organism data onto the Semantic Web. SADI is a lightweight standard for implementing Web services that natively consume and generate RDF, while GMOD is a widely-used ...